AITopics | main agent

Collaborating Authors

main agent

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

LLM Collaboration With Multi-Agent Reinforcement Learning

Liu, Shuo, Chen, Tianle, Liang, Zeyu, Lyu, Xueguang, Amato, Christopher

arXiv.org Artificial IntelligenceDec-10-2025

A large amount of work has been done in Multi-Agent Systems (MAS) for modeling and solving problems with multiple interacting agents. However, most LLMs are pretrained independently and not specifically optimized for coordination. For example, existing LLM fine-tuning frameworks rely on individual rewards, which require complex reward designs for each agent to encourage collaboration. To address this challenge, we model LLM collaboration as a cooperative Multi-Agent Reinforcement Learning (MARL) problem. We develop a multi-agent, multi-turn algorithm, Multi-Agent Group Relative Policy Optimization (MAGRPO), to solve it, building on current RL approaches for LLMs as well as MARL techniques. Our experiments on LLM writing and coding collaboration demonstrate that fine-tuning multiple LLMs with MAGRPO enables agents to generate high-quality responses efficiently through effective cooperation. Our approach opens the door to using MARL methods for LLM collaboration and highlights the associated challenges.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.04652

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)

Genre: Research Report (1.00)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Add feedback

Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO

Hong, Haoyang, Yin, Jiajun, Wang, Yuan, Liu, Jingnan, Chen, Zhe, Yu, Ailing, Li, Ji, Ye, Zhiling, Xiao, Hansong, Chen, Yefei, Zhou, Hualei, Yue, Yun, Yang, Minghui, Guo, Chunxiao, Liu, Junwei, Wei, Peng, Gu, Jinjie

arXiv.org Artificial IntelligenceNov-19-2025

Multi-agent systems perform well on general reasoning tasks. However, the lack of training in specialized areas hinders their accuracy. Current training methods train a unified large language model (LLM) for all agents in the system. This may limit the performances due to different distributions underlying for different agents. Therefore, training multi-agent systems with distinct LLMs should be the next step to solve. However, this approach introduces optimization challenges. For example, agents operate at different frequencies, rollouts involve varying sub-agent invocations, and agents are often deployed across separate servers, disrupting end-to-end gradient flow. To address these issues, we propose M-GRPO, a hierarchical extension of Group Relative Policy Optimization designed for vertical Multi-agent systems with a main agent (planner) and multiple sub-agents (multi-turn tool executors). M-GRPO computes group-relative advantages for both main and sub-agents, maintaining hierarchical credit assignment. It also introduces a trajectory-alignment scheme that generates fixed-size batches despite variable sub-agent invocations. We deploy a decoupled training pipeline in which agents run on separate servers and exchange minimal statistics via a shared store. This enables scalable training without cross-server backpropagation. In experiments on real-world benchmarks (e.g., GAIA, XBench-DeepSearch, and WebWalkerQA), M-GRPO consistently outperforms both single-agent GRPO and multi-agent GRPO with frozen sub-agents, demonstrating improved stability and sample efficiency. These results show that aligning heterogeneous trajectories and decoupling optimization across specialized agents enhances tool-augmented reasoning tasks.

agent, artificial intelligence, arxiv, (18 more...)

arXiv.org Artificial Intelligence

2511.13288

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.67)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Add feedback

Efficient Reinforcement Learning for Zero-Shot Coordination in Evolving Games

Hui, Bingyu, Yu, Lebin, Yao, Quanming, Qu, Yunpeng, Zhang, Xudong, Wang, Jian

arXiv.org Artificial IntelligenceNov-19-2025

Zero-shot coordination(ZSC), a key challenge in multi-agent game theory, has become a hot topic in reinforcement learning (RL) research recently, especially in complex evolving games. It focuses on the generalization ability of agents, requiring them to coordinate well with collaborators from a diverse, potentially evolving, pool of partners that are not seen before without any fine-tuning. Population-based training, which approximates such an evolving partner pool, has been proven to provide good zero-shot coordination performance; nevertheless, existing methods are limited by computational resources, mainly focusing on optimizing diversity in small populations while neglecting the potential performance gains from scaling population size. To address this issue, this paper proposes the Scalable Population Training (ScaPT), an efficient RL training framework comprising two key components: a meta-agent that efficiently realizes a population by selectively sharing parameters across agents, and a mutual information regularizer that guarantees population diversity. To empirically validate the effectiveness of ScaPT, this paper evaluates it along with representational frameworks in Han-abi cooperative game and confirms its superiority.

large language model, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2511.11083

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games (0.66)
Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.84)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)

Add feedback

COMPASS: Enhancing Agent Long-Horizon Reasoning with Evolving Context

Wan, Guangya, Ling, Mingyang, Ren, Xiaoqi, Han, Rujun, Li, Sheng, Zhang, Zizhao

arXiv.org Artificial IntelligenceOct-13-2025

Long-horizon tasks that require sustained reasoning and multiple tool interactions remain challenging for LLM agents: small errors compound across steps, and even state-of-the-art models often hallucinate or lose coherence. We identify context management as the central bottleneck -- extended histories cause agents to overlook critical evidence or become distracted by irrelevant information, thus failing to replan or reflect from previous mistakes. To address this, we propose COMPASS (Context-Organized Multi-Agent Planning and Strategy System), a lightweight hierarchical framework that separates tactical execution, strategic oversight, and context organization into three specialized components: (1) a Main Agent that performs reasoning and tool use, (2) a Meta-Thinker that monitors progress and issues strategic interventions, and (3) a Context Manager that maintains concise, relevant progress briefs for different reasoning stages. Across three challenging benchmarks -- GAIA, BrowseComp, and Humanity's Last Exam -- COMPASS improves accuracy by up to 20% relative to both single- and multi-agent baselines. We further introduce a test-time scaling extension that elevates performance to match established DeepResearch agents, and a post-training pipeline that delegates context management to smaller models for enhanced efficiency.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.0879

Country: North America > United States (0.68)

Genre:

Workflow (1.00)
Research Report > New Finding (0.67)

Industry:

Leisure & Entertainment > Sports > Soccer (0.94)
Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

LLM-based Multi-Agent Blackboard System for Information Discovery in Data Science

Salemi, Alireza, Parmar, Mihir, Goyal, Palash, Song, Yiwen, Yoon, Jinsung, Zamani, Hamed, Palangi, Hamid, Pfister, Tomas

arXiv.org Artificial IntelligenceOct-3-2025

The rapid advancement of Large Language Models (LLMs) has opened new opportunities in data science, yet their practical deployment is often constrained by the challenge of discovering relevant data within large heterogeneous data lakes. Existing methods struggle with this: single-agent systems are quickly overwhelmed by large, heterogeneous files in the large data lakes, while multi-agent systems designed based on a master-slave paradigm depend on a rigid central controller for task allocation that requires precise knowledge of each sub-agent's capabilities. To address these limitations, we propose a novel multi-agent communication paradigm inspired by the blackboard architecture for traditional AI models. In this framework, a central agent posts requests to a shared blackboard, and autonomous subordinate agents -- either responsible for a partition of the data lake or general information retrieval -- volunteer to respond based on their capabilities. This design improves scalability and flexibility by eliminating the need for a central coordinator to have prior knowledge of all sub-agents' expertise. We evaluate our method on three benchmarks that require explicit data discovery: KramaBench and modified versions of DS-Bench and DA-Code to incorporate data discovery. Experimental results demonstrate that the blackboard architecture substantially outperforms baselines, including RAG and the master-slave multi-agent paradigm, achieving between 13% to 57% relative improvement in end-to-end task success and up to a 9% relative gain in F1 score for data discovery over the best-performing baselines across both proprietary and open-source LLMs. Our findings establish the blackboard paradigm as a scalable and generalizable communication framework for multi-agent systems.

agent, artificial intelligence, llm-based multi-agent blackboard system, (11 more...)

arXiv.org Artificial Intelligence

2510.01285

Country: North America > United States (0.93)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (0.46)
Health & Medicine (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Add feedback

Gesture Evaluation in Virtual Reality

Werner, Axel Wiebe, Beskow, Jonas, Deichler, Anna

arXiv.org Artificial IntelligenceSep-17-2025

In the realm of human communication, gestures serve as an integral component, facilitating nonverbal expression and enhancing the richness of interpersonal interactions Studdert-Kennedy [1994]. Gestures can convey many different types of information between speaker and interlocutor, ranging from clear communication of intent, e.g., the thumbs up gesture Kendon [1997], to ambiguous and non-codified gestures which may convey some subconscious thought Goldin-Meadow [1999] or emotional state Krauss et al. [1996]. A burgeoning technology that seeks to utilize the communicative quality of gesticulation is the field of Embodied Conversational Agents (ECAs) Louwerse et al. [2009]Cassell [2000], in which the study of gestural communication gains prominence as researchers seek to produce AIgenerated gestures to increase the life-likeness and communicative qualities of ECAs Kucherenko et al. [2023]. So far, the evaluations of these generated gestures have mainly been conducted in 2D settings.

artificial intelligence, human computer interaction, natural language, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3686215.3688821

2509.12816

Country: North America > United States (0.69)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.87)
Information Technology > Artificial Intelligence > Natural Language (0.87)

Add feedback

Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training

Fang, Tianqing, Zhang, Zhisong, Wang, Xiaoyang, Wang, Rui, Qin, Can, Wan, Yuxuan, Ma, Jun-Yu, Zhang, Ce, Chen, Jiaqi, Li, Xiyun, Zhang, Hongming, Mi, Haitao, Yu, Dong

arXiv.org Artificial IntelligenceAug-13-2025

General AI Agents are increasingly recognized as foundational frameworks for the next generation of artificial intelligence, enabling complex reasoning, web interaction, coding, and autonomous research capabilities. However, current agent systems are either closed-source or heavily reliant on a variety of paid APIs and proprietary tools, limiting accessibility and reproducibility for the research community. In this work, we present \textbf{Cognitive Kernel-Pro}, a fully open-source and (to the maximum extent) free multi-module agent framework designed to democratize the development and evaluation of advanced AI agents. Within Cognitive Kernel-Pro, we systematically investigate the curation of high-quality training data for Agent Foundation Models, focusing on the construction of queries, trajectories, and verifiable answers across four key domains: web, file, code, and general reasoning. Furthermore, we explore novel strategies for agent test-time reflection and voting to enhance agent robustness and performance. We evaluate Cognitive Kernel-Pro on GAIA, achieving state-of-the-art results among open-source and free agents. Notably, our 8B-parameter open-source model surpasses previous leading systems such as WebDancer and WebSailor, establishing a new performance standard for accessible, high-capability AI agents. Code is available at https://github.com/Tencent/CognitiveKernel-Pro

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.00414

Country:

Asia (0.67)
Europe > Austria (0.28)

Genre:

Research Report (1.00)
Workflow (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

A Self-Improving Coding Agent

Robeyns, Maxime, Szummer, Martin, Aitchison, Laurence

arXiv.org Artificial IntelligenceMay-20-2025

Recent advancements in Large Language Models (LLMs) have spurred interest in deploying LLM agents to undertake tasks in the world. LLMs are often deployed in agent systems: code that orchestrates LLM calls and provides them with tools. We demonstrate that an agent system, equipped with basic coding tools, can autonomously edit itself, and thereby improve its performance on benchmark tasks. We find performance gains from 17% to 53% on a random subset of SWE Bench Verified, with additional performance gains on LiveCodeBench, as well as synthetically generated agent benchmarks. Our work represents an advancement in the automated and open-ended design of agentic systems, and demonstrates a data-efficient, non gradient-based learning mechanism driven by LLM reflection and code updates.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2504.15228

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Towards Adaptive Software Agents for Debugging

Majdoub, Yacine, Charrada, Eya Ben, Touati, Haifa

arXiv.org Artificial IntelligenceApr-28-2025

Using multiple agents was found to improve the debugging capabilities of Large Language Models. However, increasing the number of LLM-agents has several drawbacks such as increasing the running costs and rising the risk for the agents to lose focus. In this work, we propose an adaptive agentic design, where the number of agents and their roles are determined dynamically based on the characteristics of the task to be achieved. In this design, the agents roles are not predefined, but are generated after analyzing the problem to be solved. Our initial evaluation shows that, with the adaptive design, the number of agents that are generated depends on the complexity of the buggy code. In fact, for simple code with mere syntax issues, the problem was usually fixed using one agent only. However, for more complex problems, we noticed the creation of a higher number of agents. Regarding the effectiveness of the fix, we noticed an average improvement of 11% compared to the one-shot prompting. Given these promising results, we outline future research directions to improve our design for adaptive software agents that can autonomously plan and conduct their software goals.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2504.18316

Country:

Europe > Norway (0.16)
Africa > Middle East > Tunisia (0.15)

Genre:

Research Report (0.68)
Workflow (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents

Xu, Tianqi, Chen, Linyao, Wu, Dai-Jie, Chen, Yanjun, Zhang, Zecheng, Yao, Xiang, Xie, Zhiqiang, Chen, Yongchao, Liu, Shilong, Qian, Bochen, Torr, Philip, Ghanem, Bernard, Li, Guohao

arXiv.org Artificial IntelligenceJul-1-2024

The development of autonomous agents increasingly relies on Multimodal Language Models (MLMs) to perform tasks described in natural language with GUI environments, such as websites, desktop computers, or mobile phones. Existing benchmarks for MLM agents in interactive environments are limited by their focus on a single environment, lack of detailed and generalized evaluation methods, and the complexities of constructing tasks and evaluators. To overcome these limitations, we introduce Crab, the first agent benchmark framework designed to support cross-environment tasks, incorporating a graph-based fine-grained evaluation method and an efficient mechanism for task and evaluator construction. Our framework supports multiple devices and can be easily extended to any environment with a Python interface. Leveraging Crab, we developed a cross-platform Crab Benchmark-v0 comprising 100 tasks in computer desktop and mobile phone environments. We evaluated four advanced MLMs using different single and multi-agent system configurations on this benchmark. The experimental results demonstrate that the single agent with GPT-4o achieves the best completion ratio of 35.26%. All framework code, agent code, and task datasets are publicly available at https://github.com/camel-ai/crab.

agent, application, evaluator, (13 more...)

arXiv.org Artificial Intelligence

2407.01511

Country:

North America > United States > California > Los Angeles County > Pasadena (0.04)
North America > Canada (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)

Industry:

Information Technology > Software (0.49)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback